Segatron: Segment-Aware Transformer for Language Modeling and Understanding


Abstract

Transformers are powerful for sequence modeling. Nearly all state-of-the-art language models and pre-trained language models are based on the Transformer architecture. However, it distinguishes sequential tokens only with the token position index. We hypothesize that better contextual representations can be generated from the Transformer with richer positional information. To verify this, we propose a segment-aware Transformer (Segatron), replacing the original token position encoding with a combined position encoding of paragraph, sentence, and token. We first introduce the segment-aware mechanism to Transformer-XL, a popular Transformer-based language model with memory extension and relative position encoding. We find that our method can further improve the Transformer-XL base and large models, achieving 17.1 perplexity on the WikiText-103 dataset. We further investigate the pre-training masked language modeling task with Segatron. Experimental results show that BERT pre-trained with Segatron (SegaBERT) can outperform vanilla BERT on various NLP tasks, and outperforms RoBERTa on zero-shot sentence representation learning. Our code is available on GitHub.
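The combined position encoding described above assigns each token three indices (paragraph, sentence, and token) instead of a single sequence index; embeddings of these indices are then summed into the token representation. A minimal sketch of that indexing step, in plain Python (whether the sentence and token indices reset at each enclosing segment boundary is an illustrative assumption here, not the paper's exact configuration):

```python
def segment_positions(paragraphs):
    """Compute Segatron-style position triples for each token.

    `paragraphs` is a nested list: paragraphs -> sentences -> tokens.
    Returns one (paragraph_index, sentence_index, token_index) triple
    per token, in reading order. In a segment-aware model, each of the
    three indices would look up its own position embedding, and the
    three embeddings would be summed in place of the usual single
    token-position embedding.
    """
    triples = []
    for p, paragraph in enumerate(paragraphs):
        for s, sentence in enumerate(paragraph):
            # token index restarts at each sentence; sentence index
            # restarts at each paragraph (an assumption of this sketch)
            for t, _token in enumerate(sentence):
                triples.append((p, s, t))
    return triples

doc = [
    [["Transformers", "are", "powerful", "."], ["They", "model", "sequences", "."]],
    [["Segatron", "adds", "segment", "positions", "."]],
]
print(segment_positions(doc)[:3])  # → [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
```

Under this scheme, two tokens at the same within-sentence offset but in different sentences receive distinct triples, which is exactly the extra positional signal a plain token index cannot provide.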


Similar resources

Knowledge-Aware Natural Language Understanding

Natural Language Understanding (NLU) systems need to encode human generated text (or speech) and reason over it at a deep semantic level. Any NLU system typically involves two main components: The first is an encoder, which composes words (or other basic linguistic units) within the input utterances to compute encoded representations, that are then used as features in the second component, a pr...


A statistical segment-based approach for spoken language understanding

In this paper we propose an algorithm to learn statistical language understanding models from a corpus of unaligned pairs of sentences and their corresponding semantic representations. Specifically, it automatically maps variable-length word segments to their corresponding semantic units and thus enables the decoding of user utterances into their corresponding meanings. In this way we avoid the...


Context-aware Spoken Language Understanding for Human Robot Interaction

Robots operate in specific environments, and the correct interpretation of linguistic interactions depends on physical, cognitive, and language-dependent aspects triggered by the environment. In this work, we present LU4R (adaptive spoken Language Understanding 4 Robots), a Spoken Language Understanding chain for the semantic interpretation of robotic commands that is sensitive to the ope...


Social Understanding and Language

Social interaction requires the ability to understand people's mental states (their intentions, desires, and beliefs). The ability to attribute mental states is called "theory of mind." Theory of mind leads a person to explain and predict behavior. Researchers have shown great interest in the relation of language and mind / language and thought. Some researchers focused on methodological conce...


Low-Rank RNN Adaptation for Context-Aware Language Modeling

A context-aware language model uses location, user, and/or domain metadata (context) to adapt its predictions. In neural language models, context information is typically represented as an embedding and given to the RNN as an additional input, which has been shown to be useful in many applications. We introduce a more powerful mechanism for using context to adapt an RNN by letting the cont...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i14.17485